HPB SMT of FRDC Assisted by Paraphrasing for the NTCIR-9 PatentMT

Authors

  • Zhongguang Zheng
  • Naisheng Ge
  • Yao Meng
  • Hao Yu
Abstract

This paper describes the FRDC machine translation system for the NTCIR-9 PatentMT. The FRDC system, JIANZHEN, is a hierarchical phrase-based (HPB) translation system. We participated in all three subtasks: Chinese to English, Japanese to English, and English to Japanese. We introduce a novel paraphrasing mechanism to handle a certain kind of Chinese sentence whose syntactic components are far apart. The paraphrasing approach, based on manual templates, moves far-separated syntactic components closer together so that the translation becomes more acceptable. In addition, we single out parentheses for special treatment in all three languages.

Similar resources

ZZX_MT: the BeiHang MT System for NTCIR-9 PatentMT Task

In this paper, we describe the ZZX_MT machine translation system for the NTCIR-9 Patent Machine Translation Task (PatentMT). We participated in the Chinese-English translation subtask and submitted three results, which correspond to three different models or decoding algorithms. The first two are both phrase-based SMT approaches integrating the BTG constraint into reordering models, and...

System Description of BJTU-NLP SMT for NTCIR-9 PatentMT

This paper presents an overview of the statistical machine translation systems that BJTU-NLP developed for the NTCIR-9 Patent Machine Translation Task (NTCIR-9 PatentMT). We compared the performance of the phrase-based translation model and the factored translation model in our patent SMT for Chinese to English and English to Japanese. Factored translation model was proposed as an extended phrase-base...

NTT-UT Statistical Machine Translation in NTCIR-9 PatentMT

This paper describes details of the NTT-UT system in the NTCIR-9 PatentMT task. One of its key technologies is system combination: the final translation hypotheses are chosen from the n-best lists of different SMT systems in a Minimum Bayes Risk (MBR) manner. Each SMT system includes different technology: syntactic pre-ordering, forest-to-string translation, and using external resources for domain adaptation a...

BBN's Systems for the Chinese-English Sub-task of the NTCIR-9 PatentMT Evaluation

This paper describes the work we conducted to build a statistical machine translation (SMT) system for the Chinese-English sub-task of the NTCIR-9 patent machine translation (MT) evaluation [17]. We first applied to patent data the various techniques that we had developed for improving SMT performance on other types of data. Our results show that most of the techniques work on patent documen...

NTT-NII Statistical Machine Translation for NTCIR-10 PatentMT

This paper describes details of the NTT-NII system in the NTCIR-10 PatentMT task. The system extends the NTT-UT system from NTCIR-9 with a new English dependency parser (for the EJ task), syntactic rule-based pre-ordering (for the JE task), and syntax-based post-ordering (for the JE task). Our system ranked 1st in the EJ subtask in both automatic and subjective evaluation, and was the best SMT system in JE s...


Publication date: 2011